Transfer and multi-task learning have traditionally focused on either a single source-target pair or very few, similar tasks. Ideally, the linguistic levels of morphology, syntax and semantics would benefit each other by being trained in a single model. We introduce a joint many-task model together with a strategy for successively growing its depth to solve increasingly complex tasks. Higher layers include shortcut connections to lower-level task predictions to reflect linguistic hierarchies. We use a simple regularization term to allow for optimizing all model weights to improve one task's loss without exhibiting catastrophic interference of the other tasks. Our single end-to-end model obtains state-of-the-art or competitive results on five different tasks spanning tagging, parsing, relatedness, and entailment.
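One way to read the "simple regularization term" is as an L2 penalty that anchors shared parameters to their values from before the current task's update, so that optimizing one task's loss does not drift far from the solutions found for the others. A minimal sketch of such a per-task objective (the symbols L_task, delta, and theta' are illustrative, not taken from the abstract):

\[
L'_{\text{task}}(\theta) \;=\; L_{\text{task}}(\theta) \;+\; \delta \,\lVert \theta - \theta' \rVert_2^2
\]

where \(\theta'\) denotes the parameter values saved before training on the current task, and \(\delta\) controls how strongly the penalty guards against catastrophic interference.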